On the mutual nearest neighbors estimate in regression

نویسندگان

  • Arnaud Guyader
  • Nicolas W. Hengartner
چکیده

Motivated by promising experimental results, this paper investigates the theoretical properties of a recently proposed nonparametric estimator, called the Mutual Nearest Neighbors rule, which estimates the regression function m(x) = E[Y |X = x] as follows: first identify the k nearest neighbors of x in the sample Dn, then keep only those for which x is itself one of the k nearest neighbors, and finally take the average over the corresponding response variables. We prove that this estimator is consistent and that its rate of convergence is optimal. Since the estimate with the optimal rate of convergence depends on the unknown distribution of the observations, we also present adaptation results by data-splitting. Index Terms — Nonparametric estimation, Nearest neighbor methods, Mathematical statistics. 2010 Mathematics Subject Classification: 62C10, 62F15, 62G20.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bayesian Kernel and Mutual $k$-Nearest Neighbor Regression

We propose Bayesian extensions of two nonparametric regression methods which are kernel and mutual k-nearest neighbor regression methods. Derived based on Gaussian process models for regression, the extensions provide distributions for target value estimates and the framework to select the hyperparameters. It is shown that both the proposed methods asymptotically converge to kernel and mutual k...

متن کامل

Mutual Information and Conditional Mean Prediction Error

Mutual information is fundamentally important for measuring statistical dependence between variables and for quantifying information transfer by signaling and communication mechanisms. It can, however, be challenging to evaluate for physical models of such mechanisms and to estimate reliably from data. Furthermore, its relationship to better known statistical procedures is still poorly understo...

متن کامل

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...

متن کامل

ERAF: A R Package for Regression and Forecasting

We present a package for R language containing a set of tools for regression using ensembles of learning machines and for time series forecasting. The package contains implementations of Bagging and Adaboost for regression, and algorithms for computing mutual information, autocorrelation and false nearest neighbors.

متن کامل

Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest

    Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 14  شماره 

صفحات  -

تاریخ انتشار 2013